Used Cars Prediction Analysis¶

There are numerous aspects to consider when purchasing a car, the main one being whether to buy new or used. If you are trying to manage your finances wisely, a pre-owned car is often the better choice. Though a new car may sound tempting, its rapid depreciation, higher price, and higher insurance costs, among other factors, work against it.

Importing necessary libraries¶

Data Preprocessing¶

   Unnamed: 0  Name                                       Location    Year  Kilometers_Driven  Fuel_Type  Transmission  Owner_Type  Mileage      Engine   Power      Seats  New_Price
0  0           Maruti Alto K10 LXI CNG                    Delhi       2014  40929              CNG        Manual        First       32.26 km/kg  998 CC   58.2 bhp   4.0    NaN
1  1           Maruti Alto 800 2016-2019 LXI              Coimbatore  2013  54493              Petrol     Manual        Second      24.7 kmpl    796 CC   47.3 bhp   5.0    NaN
2  2           Toyota Innova Crysta Touring Sport 2.4 MT  Mumbai      2017  34000              Diesel     Manual        First       13.68 kmpl   2393 CC  147.8 bhp  7.0    25.27 Lakh
3  3           Toyota Etios Liva GD                       Hyderabad   2012  139000             Diesel     Manual        First       23.59 kmpl   1364 CC  null bhp   5.0    NaN
4  4           Hyundai i20 Magna                          Mumbai      2014  29000              Petrol     Manual        First       18.5 kmpl    1197 CC  82.85 bhp  5.0    NaN
   Unnamed: 0  Name                              Location    Year  Kilometers_Driven  Fuel_Type  Transmission  Owner_Type  Mileage     Engine   Power      Seats  New_Price  Price
0  0           Maruti Wagon R LXI CNG            Mumbai      2010  72000              CNG        Manual        First       26.6 km/kg  998 CC   58.16 bhp  5.0    NaN        1.75
1  1           Hyundai Creta 1.6 CRDi SX Option  Pune        2015  41000              Diesel     Manual        First       19.67 kmpl  1582 CC  126.2 bhp  5.0    NaN        12.50
2  2           Honda Jazz V                      Chennai     2011  46000              Petrol     Manual        First       18.2 kmpl   1199 CC  88.7 bhp   5.0    8.61 Lakh  4.50
3  3           Maruti Ertiga VDI                 Chennai     2012  87000              Diesel     Manual        First       20.77 kmpl  1248 CC  88.76 bhp  7.0    NaN        6.00
4  4           Audi A4 New 2.0 TDI Multitronic   Coimbatore  2013  40670              Diesel     Automatic     Second      15.2 kmpl   1968 CC  140.8 bhp  5.0    NaN        17.74
Unnamed: 0             int64
Name                  object
Location              object
Year                   int64
Kilometers_Driven      int64
Fuel_Type             object
Transmission          object
Owner_Type            object
Mileage               object
Engine                object
Power                 object
Seats                float64
New_Price             object
Price                float64
dtype: object
Unnamed: 0             int64
Name                  object
Location              object
Year                   int64
Kilometers_Driven      int64
Fuel_Type             object
Transmission          object
Owner_Type            object
Mileage               object
Engine                object
Power                 object
Seats                float64
New_Price             object
dtype: object
Rows in dataset are : 6019 
Columns in dataset are : 14
Rows in test dataset are : 1234 
Columns in test dataset are : 13
        Unnamed: 0         Year  Kilometers_Driven        Seats        Price
count  6019.000000  6019.000000       6.019000e+03  5977.000000  6019.000000
mean   3009.000000  2013.358199       5.873838e+04     5.278735     9.479468
std    1737.679967     3.269742       9.126884e+04     0.808840    11.187917
min       0.000000  1998.000000       1.710000e+02     0.000000     0.440000
25%    1504.500000  2011.000000       3.400000e+04     5.000000     3.500000
50%    3009.000000  2014.000000       5.300000e+04     5.000000     5.640000
75%    4513.500000  2016.000000       7.300000e+04     5.000000     9.950000
max    6018.000000  2019.000000       6.500000e+06    10.000000   160.000000
        Unnamed: 0         Year  Kilometers_Driven        Seats
count  1234.000000  1234.000000        1234.000000  1223.000000
mean    616.500000  2013.400324       58507.288493     5.284546
std     356.369424     3.179700       35598.702098     0.825622
min       0.000000  1996.000000        1000.000000     2.000000
25%     308.250000  2011.000000       34000.000000     5.000000
50%     616.500000  2014.000000       54572.500000     5.000000
75%     924.750000  2016.000000       75000.000000     5.000000
max    1233.000000  2019.000000      350000.000000    10.000000
Unnamed: 0              0
Name                    0
Location                0
Year                    0
Kilometers_Driven       0
Fuel_Type               0
Transmission            0
Owner_Type              0
Mileage                 2
Engine                 36
Power                  36
Seats                  42
New_Price            5195
Price                   0
dtype: int64
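
The counts above show New_Price missing in 5195 of 6019 rows (about 86%), while Mileage, Engine, Power, and Seats have only a few gaps. How the notebook handled them is not shown. The following is a minimal sketch of one common approach, assuming a mostly empty column is dropped and small numeric gaps are filled with the column median. The helper `drop_sparse_and_fill`, the 0.5 threshold, and the toy rows are hypothetical.

```python
from statistics import median


def drop_sparse_and_fill(rows, max_missing_share=0.5):
    """Drop columns that are mostly missing, then median-fill numeric gaps.

    `rows` is a list of dicts sharing the same keys; None marks a missing value.
    """
    n = len(rows)
    kept = []
    for col in rows[0]:
        missing = sum(1 for r in rows if r[col] is None)
        if missing / n <= max_missing_share:
            kept.append(col)

    # Median of the observed values, for numeric columns only.
    fills = {}
    for col in kept:
        present = [r[col] for r in rows if r[col] is not None]
        if present and all(isinstance(v, (int, float)) for v in present):
            fills[col] = median(present)

    cleaned = []
    for r in rows:
        new_row = {}
        for col in kept:
            value = r[col]
            if value is None and col in fills:
                value = fills[col]
            new_row[col] = value
        cleaned.append(new_row)
    return cleaned


# Hypothetical toy rows, not taken from the dataset.
rows = [
    {"Seats": 5.0, "New_Price": None},
    {"Seats": None, "New_Price": None},
    {"Seats": 7.0, "New_Price": "8.61 Lakh"},
    {"Seats": 5.0, "New_Price": None},
]
cleaned = drop_sparse_and_fill(rows)
```

In this toy input New_Price is missing in three of four rows and is dropped, and the missing Seats value is filled with the median of the remaining ones.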
Missing values in first list: {'Tata Tiago AMT 1.2 Revotron XTA', 'Tata Tiago 1.05 Revotorq XT Option', 'Maruti Ciaz VDi Option SHVS', 'Toyota Land Cruiser Prado VX L', 'Ford EcoSport 1.5 Petrol Ambiente', 'Hyundai Elite i20 Magna Plus', 'Mercedes-Benz E-Class 250 D W 124', 'Toyota Camry MT with Moonroof', 'Chevrolet Sail Hatchback 1.2', 'Toyota Etios Cross 1.2L G', 'Maruti Celerio X VXI Option', 'Mahindra Scorpio SLX 2.6 Turbo 8 Str', 'Maruti Ertiga VXI Petrol', 'Volkswagen Vento 1.2 TSI Comfortline AT', 'Jaguar XF 2.0 Petrol Portfolio', 'Mahindra Scorpio VLS 2.2 mHawk', 'Renault Duster 85PS Diesel RxZ', 'Mercedes-Benz B Class B180 Sports', 'Mercedes-Benz B Class B180 Sport', 'Skoda Laura L and K MT', 'Hyundai Elantra SX AT', 'Mahindra Scorpio VLX Special Edition BS-IV', 'Tata Indica Vista Quadrajet LX', 'Hyundai Elantra GT', 'Maruti Vitara Brezza ZDi AMT', 'Skoda Laura 1.8 TSI Ambition', 'Toyota Innova 2.5 GX 8 STR', 'Honda Mobilio V i VTEC', 'Tata Indica Vista Terra Quadrajet 1.3L BS IV', 'Skoda Laura 1.9 TDI MT Elegance', 'Audi Q5 2008-2012 3.0 TDI Quattro', 'Toyota Etios Liva 1.4 VXD', 'Hindustan Motors Contessa 2.0 DSL', 'Maruti SX4 ZXI AT', 'Hyundai i20 Active SX Diesel', 'Hyundai Sonata Embera 2.4L MT', 'Fiat Punto EVO 1.3 Emotion', 'Hyundai i20 new Sportz AT 1.4', 'OpelCorsa 1.4Gsi', 'Nissan Micra XL CVT', 'Honda Jazz VX CVT', 'Skoda Rapid Ultima 1.6 TDI Ambition Plus', 'BMW 5 Series 530i Sport Line', 'Volvo S60 D5 Kinetic', 'Fiat Linea Dynamic', 'Hyundai Xcent 1.2 CRDi SX', 'Mahindra Xylo H9', 'Mahindra Verito Vibe 1.5 dCi D6', 'Maruti Vitara Brezza ZDi Plus AMT', 'Honda CR-V Diesel', 'Honda BRV i-DTEC V MT', 'Maruti Versa DX2', 'Maruti Ignis 1.2 AMT Delta', 'Toyota Innova 2.5 LE 2014 Diesel 8 Seater', 'Skoda Octavia 2.0 TDI MT Style', 'Ford Endeavour 3.0L AT 4x2', 'Honda BR-V i-DTEC S MT', 'Mahindra Bolero SLX', 'Nissan 370Z AT', 'Honda Civic 2010-2013 1.8 V AT', 'Tata Manza Club Class Safire90 LX', 'Toyota Etios Liva VD', 'Ford Freestyle Titanium Plus 
Diesel', 'Mercedes-Benz GLA Class 220 d 4MATIC', 'Hyundai Verna Transform VTVT with Audio', 'Toyota Innova Crysta Touring Sport 2.4 MT', 'Mahindra Xylo E9', 'Maruti Swift VVT ZXI', 'Ford Ikon 1.4 ZXi', 'Chevrolet Enjoy 1.4 LTZ 8', 'Skoda Superb Petrol Ambition', 'Fiat Punto 1.4 Emotion', 'Mahindra KUV 100 mFALCON G80 K4 5str', 'BMW 5 Series 520d Sedan', 'Mahindra TUV 300 2015-2019 T8 AMT', 'Hyundai i20 2015-2017 1.4 CRDi Sportz', 'Hyundai Creta 1.6 SX Diesel', 'BMW 3 Series GT 320d Sport Line', 'Hyundai Verna Transform SX VGT CRDi BS III', 'Nissan Teana XL', 'Chevrolet Spark 1.0 PS', 'Fiat Avventura Urban Cross 1.3 Multijet Emotion', 'Fiat Linea Classic 1.3 Multijet', 'Hyundai EON 1.0 Kappa Magna Plus', 'Maruti Swift AMT ZXI', 'Honda Amaze VX CVT i-VTEC', 'Tata Indica Vista Aqua TDI BSIII', 'Mercedes-Benz E-Class E240 V6 AT', 'Bentley Flying Spur W12', 'BMW 7 Series 730Ld DPE Signature', 'Hyundai Tucson 2.0 e-VGT 4WD AT GLS', 'Mahindra Scorpio S10 8 Seater', 'Hyundai Accent Executive LPG', 'Ford Classic 1.4 Duratorq CLXI', 'Hyundai Verna 1.4 CX', 'Volkswagen CrossPolo 1.2 TDI', 'Honda Civic 2010-2013 1.8 S MT Inspire', 'Mahindra Scorpio VLX 2WD BSIII', 'Toyota Innova 2.0 V', 'Chevrolet Enjoy 1.3 TCDi LTZ 7', 'Fiat Abarth 595 Competizione', 'Tata Tigor 1.2 Revotron XZ Option', 'Renault Koleos 4X2 MT', 'Land Rover Range Rover HSE', 'Hyundai Creta 1.6 VTVT Base', 'Tata Indica Vista Aqua 1.2 Safire', 'Hyundai Santro Xing GLS CNG', 'Jeep Compass 1.4 Sport', 'Chevrolet Enjoy Petrol LTZ 7 Seater', 'Honda Jazz 2020 Petrol', 'Mahindra Bolero Power Plus ZLX', 'Fiat Grande Punto 1.2 Emotion', 'Maruti Ritz VDi ABS', 'Ford Fiesta Classic 1.6 Duratec LXI', 'Maruti 800 DX', 'Mahindra Thar 4X4', 'BMW X3 2.5si', 'Honda Amaze E i-DTEC', 'Honda City i DTec VX Option BL', 'Mahindra KUV 100 mFALCON D75 K6 5str AW', 'Volkswagen Polo ALLSTAR 1.2 MPI', 'Hyundai Accent GLX', 'Chevrolet Enjoy TCDi LS 7 Seater', 'Audi Q3 30 TDI S Edition', 'Land Rover Discovery 4 SDV6 SE', 'BMW 7 Series 740i 
Sedan', 'Volkswagen Vento 1.6 Trendline', 'Hyundai Santro Xing XG AT eRLX Euro III', 'Hyundai Creta 1.6 SX Automatic', 'Renault Lodgy 110PS RxL', 'Honda WRV i-DTEC VX', 'Ford Fiesta Classic 1.6 SXI Duratec', 'Maruti Swift 1.3 VXi', 'Fiat Avventura FIRE Dynamic', 'Mercedes-Benz CLA 45 AMG', 'Renault Pulse RxZ', 'Maruti A-Star Zxi', 'Tata Indica V2 DiCOR DLG BS-III', 'Mahindra KUV 100 D75 K8 5Str', 'Hyundai Santro LS zipDrive Euro I', 'Tata Indica Vista Terra 1.2 Safire BS IV', 'Volkswagen Vento 1.5 TDI Highline Plus AT', 'Toyota Etios Liva Diesel TRD Sportivo', 'Mercedes-Benz S Class 2005 2013 320 L', 'Volkswagen Jetta 2007-2011 1.6 Trendline', 'Mahindra KUV 100 mFALCON D75 K2', 'Hyundai EON 1.0 Era Plus', 'Mahindra TUV 300 P4', 'Honda Accord 2001-2003 2.3 VTI L MT', 'Land Rover Discovery 4 TDV6 Auto Diesel', 'Volkswagen Vento 1.5 TDI Highline Plus', 'Ford Fiesta 1.4 SXI Duratorq', 'Toyota Corolla Altis GL', 'Hyundai i20 1.4 Asta AT (O) with Sunroof', 'Mitsubishi Pajero Sport 4X2 AT', 'Maruti Alto XCITE', 'Hyundai i20 2015-2017 Magna Optional 1.4 CRDi', 'Maruti Ciaz VXi', 'Maruti Wagon R VXI AMT Opt', 'Nissan Terrano XE 85 PS', 'Isuzu MU 7 4x2 HIPACK', 'Honda City ZX VTEC Plus', 'Mahindra KUV 100 G80 K4 Plus 5Str', 'Mercedes-Benz A Class Edition 1', 'Datsun GO T Petrol', 'Tata Sumo EX 10/7 Str BSII', 'Land Rover Freelander 2 S Business Edition', 'Honda BR-V i-VTEC VX MT'}
False
Missing values in first list: {'Bentley Flying', 'OpelCorsa 1.4Gsi', 'Fiat Abarth', 'Hindustan Motors', 'Isuzu MU', 'Toyota Land', 'Nissan 370Z'}
Missing values in first list: set()
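
The three printouts above appear to compare car names between two lists, first on the full Name and then on a shortened name, until no unmatched values remain. A minimal sketch of that idea follows, assuming the shortened name is the first two words of Name (consistent with entries such as 'Bentley Flying' and 'Toyota Land'). The helper `short_name` and the sample lists are hypothetical.

```python
def short_name(full_name, words=2):
    """Keep only the first `words` whitespace-separated tokens of a name."""
    return " ".join(full_name.split()[:words])


# Hypothetical samples in the style of the Name column.
train_names = [
    "Maruti Wagon R LXI CNG",
    "Honda Jazz V",
    "Audi A4 New 2.0 TDI Multitronic",
]
test_names = [
    "Honda Jazz VX CVT",
    "Bentley Flying Spur W12",
    "Maruti Wagon R VXI AMT Opt",
]

# Full names in the test list that never occur in the training list.
unseen_full = set(test_names) - set(train_names)

# The same comparison on shortened names leaves fewer unmatched values.
unseen_short = {short_name(n) for n in test_names} - {short_name(n) for n in train_names}
```

Shortening the name trades detail (the trim level is lost) for coverage: most names in one list then have a counterpart in the other.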
Unnamed: 0           0
Name                 0
Location             0
Year                 0
Kilometers_Driven    0
Fuel_Type            0
Transmission         0
Owner_Type           0
Mileage              0
Engine               0
Power                0
Seats                0
Price                0
Cars                 0
dtype: int64
Name                 0
Location             0
Year                 0
Kilometers_Driven    0
Fuel_Type            0
Transmission         0
Owner_Type           0
Mileage              0
Engine               0
Power                0
Seats                0
Cars                 0
dtype: int64
Name                  object
Location              object
Year                   int64
Kilometers_Driven      int64
Fuel_Type             object
Transmission          object
Owner_Type            object
Mileage              float64
Engine               float64
Power                float64
Seats                float64
Cars                  object
dtype: object
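
The dtypes above list Mileage, Engine, and Power as float64, whereas they started as object columns holding strings such as '998 CC', '58.2 bhp', and 'null bhp'. One way to do such a conversion is sketched below, assuming the leading number is kept and anything unparseable becomes missing. `parse_leading_number` is a hypothetical helper, not the notebook's code.

```python
import re

# A number (integer or decimal) at the start of the string.
_NUMBER = re.compile(r"^\s*(\d+(?:\.\d+)?)")


def parse_leading_number(text):
    """Return the number at the start of `text` as a float, or None."""
    if text is None:
        return None
    match = _NUMBER.match(str(text))
    return float(match.group(1)) if match else None


engine = parse_leading_number("998 CC")
power = parse_leading_number("58.2 bhp")
missing_power = parse_leading_number("null bhp")  # no leading number
mileage = parse_leading_number("26.6 km/kg")
```

Note that this sketch discards the unit, so km/kg (CNG) and kmpl values end up on the same numeric scale without any conversion.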

Data Analysis¶

Price                1.000000
Power                0.769351
Engine               0.659117
Year                 0.305800
Seats                0.052262
Kilometers_Driven   -0.011263
Mileage             -0.313877
Name: Price, dtype: float64

Conclusion- According to the correlations with Price, Power (0.77) and Engine (0.66) are the strongest positive correlates of a used car's price, followed by Year (0.31). Mileage is negatively correlated (-0.31), so more fuel-efficient cars tend to be cheaper, while Seats and Kilometers_Driven show almost no linear relationship with price.
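
As a reminder of what the coefficients listed above measure, here is a minimal sketch of the Pearson correlation in plain Python. That the notebook used Pearson (the default of pandas `DataFrame.corr`) is an assumption, and the sample values are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy values: price rises with power and falls with mileage.
power = [58.0, 88.0, 126.0, 140.0]
mileage = [26.0, 18.0, 19.0, 15.0]
price = [1.75, 4.5, 12.5, 17.7]

r_power = pearson(power, price)      # positive
r_mileage = pearson(mileage, price)  # negative
```

A value near +1 or -1 indicates a strong linear relationship; a value near 0, as for Kilometers_Driven, indicates none, which does not rule out a nonlinear one.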


Conclusion- Price was converted to log(Price) so that its distribution is closer to normal and easier to visualize.
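
The effect of the log transform can be seen on the summary statistics of Price, where the median is 5.64 but the maximum is 160. The sketch below uses the five quantile values from that table. Comparing the upper and lower tail lengths is an illustrative choice, not the notebook's code, and `tail_ratio` is a hypothetical helper.

```python
import math

# min, 25%, 50%, 75%, max of Price, taken from the summary statistics
prices = [0.44, 3.5, 5.64, 9.95, 160.0]
log_prices = [math.log(p) for p in prices]


def tail_ratio(values):
    """Distance from median to max, relative to distance from min to median."""
    low, mid, high = values[0], values[2], values[-1]
    return (high - mid) / (mid - low)


raw_ratio = tail_ratio(prices)      # upper tail far longer than the lower one
log_ratio = tail_ratio(log_prices)  # tails of comparable length
```

A ratio close to 1 means the distribution is roughly symmetric around its median, which is what the log transform achieves here. The ratio does not depend on the base of the logarithm.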

[Pie chart by Fuel_Type: Diesel 72.3%, Petrol 27.3%, CNG 0.347%, LPG 0.0438%]

Conclusion- The pie chart above shows how total price is split across fuel types (diesel, petrol, CNG, LPG). Diesel cars account for the largest share of market value, well ahead of petrol, while CNG and LPG are negligible. Diesel cars may also be the most common in the market, possibly because they give better mileage.

Conclusion- According to the plot, the number of cars with automatic transmission increases rapidly in successive years.

Conclusion- According to the plot, the number of diesel cars increases rapidly in successive years.

[Chart: Price (axis 0 to 160) by Fuel_Type (CNG, Diesel, Petrol, LPG), colored by Transmission (Manual, Automatic)]

Conclusion- The graph shows that CNG and LPG cars are available only with manual transmission, whereas automatic transmission leads among diesel and petrol cars (diesel being the most used).

Conclusion- The graph clearly indicates that people prefer manual transmission over automatic.

array(['Mumbai', 'Pune', 'Chennai', 'Coimbatore', 'Hyderabad', 'Jaipur',
       'Kochi', 'Kolkata', 'Delhi', 'Bangalore', 'Ahmedabad'],
      dtype=object)
array(['Maruti Wagon', 'Hyundai Creta', 'Honda Jazz', 'Maruti Ertiga',
       'Audi A4', 'Hyundai EON', 'Nissan Micra', 'Toyota Innova',
       'Volkswagen Vento', 'Tata Indica', 'Maruti Ciaz', 'Honda City',
       'Maruti Swift', 'Land Rover', 'Mitsubishi Pajero', 'Honda Amaze',
       'Renault Duster', 'Mercedes-Benz New', 'BMW 3', 'Maruti S',
       'Audi A6', 'Hyundai i20', 'Maruti Alto', 'Honda WRV',
       'Toyota Corolla', 'Mahindra Ssangyong', 'Maruti Vitara',
       'Mahindra KUV', 'Mercedes-Benz M-Class', 'Volkswagen Polo',
       'Tata Nano', 'Hyundai Elantra', 'Hyundai Xcent', 'Mahindra Thar',
       'Hyundai Grand', 'Renault KWID', 'Hyundai i10', 'Nissan X-Trail',
       'Maruti Zen', 'Ford Figo', 'Mercedes-Benz C-Class',
       'Porsche Cayenne', 'Mahindra XUV500', 'Nissan Terrano',
       'Honda Brio', 'Ford Fiesta', 'Hyundai Santro', 'Tata Zest',
       'Maruti Ritz', 'BMW 5', 'Toyota Fortuner', 'Ford Ecosport',
       'Hyundai Verna', 'Datsun GO', 'Maruti Omni', 'Toyota Etios',
       'Jaguar XF', 'Maruti Eeco', 'Honda Civic', 'Volvo V40',
       'Mercedes-Benz B', 'Mahindra Scorpio', 'Honda CR-V',
       'Mercedes-Benz SLC', 'BMW 1', 'Chevrolet Beat', 'Skoda Rapid',
       'Audi RS5', 'Mercedes-Benz S', 'Skoda Superb', 'BMW X5',
       'Mercedes-Benz GLC', 'Mini Countryman', 'Chevrolet Optra',
       'Renault Lodgy', 'Mercedes-Benz E-Class', 'Maruti Baleno',
       'Skoda Laura', 'Mahindra NuvoSport', 'Skoda Fabia', 'Tata Indigo',
       'Audi Q3', 'Skoda Octavia', 'Audi A8', 'Mahindra Verito',
       'Mini Cooper', 'Hyundai Santa', 'BMW X1', 'Hyundai Accent',
       'Hyundai Tucson', 'Mercedes-Benz GLE', 'Maruti A-Star',
       'Fiat Grande', 'BMW X3', 'Ford EcoSport', 'Audi Q7',
       'Volkswagen Jetta', 'Mercedes-Benz GLA', 'Maruti Celerio',
       'Tata Sumo', 'Honda Accord', 'BMW 6', 'Tata Manza',
       'Chevrolet Spark', 'Mini Clubman', 'Nissan Teana', 'Maruti 800',
       'Honda BRV', 'Jaguar XE', 'Tata Xenon', 'Audi A3',
       'Mercedes-Benz GL-Class', 'Honda BR-V', 'Volvo S80',
       'Renault Captur', 'Chevrolet Enjoy', 'Mahindra Bolero', 'Audi Q5',
       'Mitsubishi Cedia', 'Maruti S-Cross', 'Skoda Yeti',
       'Ford Endeavour', 'Mercedes-Benz GLS', 'Mercedes-Benz A',
       'Maruti SX4', 'Toyota Camry', 'Honda Mobilio', 'Fiat Linea',
       'Audi TT', 'Mahindra Renault', 'Jeep Compass', 'Ford Ikon',
       'Chevrolet Sail', 'Mahindra Quanto', 'Chevrolet Aveo',
       'Mahindra Xylo', 'Maruti Esteem', 'Tata Safari', 'Maruti Ignis',
       'Jaguar XJ', 'Nissan Sunny', 'Mercedes-Benz SLK-Class',
       'Volkswagen Passat', 'Maruti Dzire', 'Chevrolet Cruze',
       'Renault Koleos', 'Toyota Qualis', 'Volkswagen Ameo',
       'Maruti Grand', 'Datsun redi-GO', 'Smart Fortwo',
       'Mitsubishi Outlander', 'Porsche Cayman', 'Mercedes-Benz CLA',
       'Volvo XC60', 'Tata New', 'Porsche Boxster', 'Mahindra XUV300',
       'Tata Hexa', 'Tata Tiago', 'BMW 7', 'Fiat Avventura', 'Tata Tigor',
       'Volvo S60', 'Ambassador Classic', 'Volkswagen Beetle',
       'Fiat Petra', 'Hyundai Getz', 'Audi A7', 'Hyundai Elite',
       'Ford Aspire', 'Volkswagen Tiguan', 'Chevrolet Captiva',
       'Fiat Punto', 'Mahindra TUV', 'BMW X6', 'Tata Bolt',
       'Nissan Evalia', 'Renault Scala', 'Mahindra Jeep',
       'Hyundai Sonata', 'Ford Freestyle', 'Mahindra Logan',
       'Chevrolet Tavera', 'Volvo XC90', 'Renault Pulse',
       'Mitsubishi Montero', 'Porsche Panamera', 'Volkswagen CrossPolo',
       'Renault Fluence', 'Tata Venture', 'Tata Nexon', 'Isuzu MUX',
       'Toyota Platinum', 'Mercedes-Benz R-Class',
       'Mercedes-Benz CLS-Class', 'ISUZU D-MAX', 'Mercedes-Benz S-Class',
       'Mitsubishi Lancer', 'Ford Classic', 'Datsun Redi', 'Ford Mustang',
       'Ford Fusion', 'Fiat Siena', 'Maruti 1000',
       'Mercedes-Benz SL-Class', 'BMW Z4', 'Force One', 'Maruti Versa',
       'Honda WR-V', 'Bentley Continental', 'Lamborghini Gallardo',
       'Jaguar F'], dtype=object)
Location
Ahmedabad     223
Bangalore     353
Chennai       490
Coimbatore    634
Delhi         549
Hyderabad     741
Jaipur        410
Kochi         648
Kolkata       530
Mumbai        784
Pune          613
Name: Cars, dtype: int64

Observation- From the data above, Mumbai (784) and Hyderabad (741) have the largest numbers of second-hand car listings, which makes them our target audience.

Location
Ahmedabad           Volvo XC60
Bangalore            Volvo V40
Chennai              Volvo S80
Coimbatore           Volvo S60
Delhi                Volvo S60
Hyderabad           Volvo XC60
Jaipur        Volkswagen Vento
Kochi               Volvo XC90
Kolkata       Volkswagen Vento
Mumbai               Volvo S60
Pune                Volvo XC60
Name: Cars, dtype: object

Conclusion- The table above was read as the car model in highest demand in each city, with Volvo XC60 in Hyderabad and Volvo S60 in Mumbai as the target cars, since those two cities have the most second-hand listings. However, every entry is a Volvo or Volkswagen model, the alphabetically last names in the data, which suggests the table holds the maximum of the Cars text column per city rather than the most frequent model. The most-listed model in each city would have to be counted (a per-city mode) before it can serve as the main selling point.
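
To find the most frequently listed model per city, the models have to be counted within each city. Taking the maximum of a text column returns the alphabetically last name instead. A minimal sketch of both on hypothetical listings follows; the pairs and variable names are illustrative and not taken from the dataset.

```python
from collections import Counter, defaultdict

# Hypothetical (Location, Cars) pairs.
listings = [
    ("Mumbai", "Maruti Swift"),
    ("Mumbai", "Maruti Swift"),
    ("Mumbai", "Volvo S60"),
    ("Pune", "Hyundai i20"),
    ("Pune", "Volvo XC60"),
    ("Pune", "Hyundai i20"),
]

by_city = defaultdict(list)
for city, car in listings:
    by_city[city].append(car)

# String maximum: the alphabetically last model in each city.
alphabetical_last = {city: max(cars) for city, cars in by_city.items()}

# Mode: the model that occurs most often in each city.
most_common = {
    city: Counter(cars).most_common(1)[0][0] for city, cars in by_city.items()
}
```

Here the string maximum picks the Volvo in both cities even though it appears only once, while the mode picks the model that is actually listed most often.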

Conclusion - This is a Power BI report comparing various columns of the provided data, giving a clear analysis of the relationships among them.

Model fitting¶

   model                  Root Mean Squared Error  Accuracy on Training set  Accuracy on Testing set
3  MLPRegressor           261.569221               0.689665                  0.430227
4  AdaBoostRegressor      151.524606               0.824036                  0.808797
0  DecisionTreeRegressor  113.436534               0.999993                  0.89284
2  RandomForestRegressor  84.35687                 0.991785                  0.940739
5  ExtraTreesRegressor    79.868857                0.999993                  0.946877
1  XGBRegressor           74.815814                0.994635                  0.953386
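
The two kinds of figures in the model comparison can be computed from true and predicted prices as sketched below in plain Python. This assumes the accuracy columns are coefficient of determination (R²) scores, which is what scikit-learn regressors return from `score`. The sample values are hypothetical.

```python
from math import sqrt


def rmse(y_true, y_pred):
    """Root mean squared error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical true and predicted prices.
y_true = [1.75, 12.5, 4.5, 6.0]
y_pred = [2.0, 11.0, 5.0, 6.5]

error = rmse(y_true, y_pred)
score = r2_score(y_true, y_pred)
```

A large gap between the training and testing score, as for DecisionTreeRegressor (0.999993 against 0.89284), points to overfitting; XGBRegressor has both the lowest error and the highest testing score.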
   Car_id  Price
0  0       155.50
1  1       104.34
2  2       941.14
3  3       164.77
4  4       286.92

Observation- The table above shows the model's predicted price for each car, based on the specifications given in the feature1 array.

Prediction by User Input¶

Enter your own data to test the model: